Customized Vector Instruction Set Architecture
Authors
Abstract
This paper presents a methodology for synthesizing customized vector ISAs for various application domains, targeting high-performance execution. A number of applications from the telecommunication and linear algebra domains have been studied, and custom vector instruction sets have been synthesized for them. Three algorithms that compute shortest paths in a directed graph (Dijkstra, Floyd and Bellman-Ford) have been analyzed, along with the widely used Linpack floating-point benchmark. The framework used to customize the ISAs comprised the GNU C Compiler, versions 4.1.2 and 2.7.2.3, and the SimpleScalar-3.0d tool set, extended to simulate customized vector units. The modifications applied to the simulator include the addition of a vector register file, vector functional units and specific vector instructions. The main results can be summarized as follows: overall application speedups of 24.88X for Dijkstra (after both code optimization and vectorization), 4.99X for Floyd, 9.27X for Bellman-Ford and 4.33X for the C version of Linpack. These results suggest a consistent improvement in execution time due to the customized vector instruction sets.
Keywords: Vector processors, DLP, ISA, Vector Architecture, Performance
Similar resources
Customizing Vector Instruction Set Architectures
Data Level Parallelism (DLP) can be exploited to improve the performance of processors for certain workload types. Two main application fields rely on DLP: multimedia and scientific computing. Most of the existing multimedia vector extensions use sub-word parallelism and wide data paths for processing independent, mainly integer, values in parallel. On the other hand, cla...
Instruction Set Architecture Abstraction
This technical report describes CHERI ISAv3, the third version of the This report describes the CHERI Instruction-Set Architecture (ISA) and design. The purpose of this tutorial was to introduce the computer architecture Pydgin is a framework for rapidly developing instruction-set simulators (ISSs) from a but is particularly well-suited for exploring the hardware/software abstraction. The Intel...
Exploring NISC Architectures for Matrix Application
The paper presents the design of target NISC (No Instruction Set Computer) architecture for matrix application in a C based design flow. It starts with the implementation of a standard application program which generates customized designs using the NISC toolset. Further, it demonstrates and analyzes the compilation and simulation results of several matrix applications on a number of different ...
C-slow Technique vs Multiprocessor in designing Low Area Customized Instruction set Processor for Embedded Applications
The demand for high-performance embedded processors for consumer electronics has been increasing rapidly over the past few years. Many of these embedded processors depend upon a custom-built Instruction Set Architecture (ISA), such as game processors (GPU), multimedia processors, DSP processors, etc. The primary requirement of the consumer electronics industry is low cost with high performance and low power cons...
Instruction Set Architecture Extensions For A Dynamic Task Scheduling Unit
In this paper a dynamic task scheduling unit for many-core systems The CoreManager instruction set extensions are developed with the tool flow. An instruction set architecture extension for supporting fine-grain thread scheduling and execution is proposed. Task level parallelization is supported by various programming models core and the Distributed Thread Scheduling Unit (DTSU), one per node. ...